Data Exploration and Cleaning Exercise
Load the demo.xlsx dataset
Rename the columns as suggested below
Old name          New name
Age               age
Gender            gender
Marital Status    marital_status
Address           address
Income            income
Income Category   income_category
Job Category      job_category
Display all the columns in the dataset
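One possible way to do the rename is sketched below. The small DataFrame and its values are invented stand-ins, since the contents of demo.xlsx are not shown here; in the real exercise the frame would more likely come from pd.read_excel("demo.xlsx"), which needs an Excel engine such as openpyxl installed.

```python
import pandas as pd

# Hypothetical stand-in for the data in demo.xlsx; the values are invented.
df = pd.DataFrame({
    "Age": [25, 40],
    "Gender": ["m", "f"],
    "Marital Status": ["married", "unmarried"],
    "Address": [3, 12],
    "Income": [30.0, 75.0],
    "Income Category": [2, 3],
    "Job Category": [1, 2],
})

# Map each old column name to the new name given in the table.
new_names = {
    "Age": "age",
    "Gender": "gender",
    "Marital Status": "marital_status",
    "Address": "address",
    "Income": "income",
    "Income Category": "income_category",
    "Job Category": "job_category",
}
df = df.rename(columns=new_names)

# Display all the column names.
print(df.columns.tolist())
```

An explicit mapping is used here rather than a string transformation so that a misspelled or missing column name is easy to spot.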
Display some basic statistics about the numeric variables in the dataset
Display some basic statistics about the categorical variables in the dataset
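A minimal sketch of both summaries, assuming pandas and a made-up frame in place of the real data. By default describe() summarises only the numeric columns; excluding the numeric dtypes is one way to get the summary for the remaining (categorical) columns.

```python
import pandas as pd

# Hypothetical sample rows; the real data would come from demo.xlsx.
df = pd.DataFrame({
    "age": [25, 40, 33, 58],
    "income": [30.0, 75.0, 42.0, 120.0],
    "gender": ["m", "f", "f", "m"],
    "marital_status": ["married", "unmarried", "married", "no answer"],
})

# Numeric columns: count, mean, std, min, quartiles, max.
numeric_summary = df.describe()
print(numeric_summary)

# Non-numeric columns: count, unique, top (most frequent value), freq.
categorical_summary = df.describe(exclude=["number"])
print(categorical_summary)
```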
What are the unique observations under gender?
Fix any problems you observe in the gender variable, and briefly explain why and how you fixed them
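The sketch below shows one way to inspect and tidy a gender column. The sheet does not say what the problems are, so the inconsistent labels used here (mixed case, stray spaces, full words) are purely an assumption for illustration.

```python
import pandas as pd

# Hypothetical messy values; the actual problems in demo.xlsx may differ.
gender = pd.Series(["m", "f", "Male", " F", "female", "M"], name="gender")

# Step 1: look at the distinct values to spot inconsistencies.
print(gender.unique())

# Step 2: strip whitespace and lower-case so that "M", " m" and "m" agree.
normalized = gender.str.strip().str.lower()

# Step 3: map every known spelling to a single label.
# Any value not listed in the mapping becomes NaN, which makes it easy to find.
mapping = {"m": "m", "male": "m", "f": "f", "female": "f"}
cleaned = normalized.map(mapping)

print(cleaned.unique())
```

The reason for the fix is that each spelling would otherwise be counted as a separate category in frequency tables and dummy variables.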
How many observations have ‘no answer’ for marital status?
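A small sketch of counting one category with value_counts(). The sample values are invented, and the exact spelling of the label stored in the data ("no answer" here) is an assumption that should be checked against the unique values first.

```python
import pandas as pd

# Hypothetical marital status values.
marital = pd.Series(
    ["married", "no answer", "unmarried", "no answer", "married"],
    name="marital_status",
)

# Frequency of every category, then look up the one of interest.
counts = marital.value_counts()
n_no_answer = int(counts.get("no answer", 0))  # 0 if the label never occurs
print(n_no_answer)
```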
Write code that returns only the numeric variables from the dataset
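One way to do this in pandas is select_dtypes(), sketched here on an invented frame. Note that a numerically coded category (for example an income category stored as integers) would also be returned, since the selection is based on the stored dtype only.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "age": [25, 40, 33],
    "gender": ["m", "f", "f"],
    "income": [30.0, 75.0, 42.0],
})

# Keep only the columns whose dtype is numeric (integers and floats).
numeric_df = df.select_dtypes(include="number")
print(numeric_df.columns.tolist())
```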
Are there any missing values in the dataset?
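A minimal sketch for checking missing values, using an invented frame with a few gaps. It only detects values that pandas treats as missing (NaN or None); placeholder labels such as "no answer" are ordinary strings and would not be counted.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows with some missing entries.
df = pd.DataFrame({
    "age": [25, np.nan, 33],
    "income": [30.0, 75.0, np.nan],
    "gender": ["m", None, "f"],
})

# Number of missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Single yes/no answer for the whole dataset.
has_missing = bool(df.isna().any().any())
print(has_missing)
```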
Are there any outliers in the income variable?
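The sheet does not prescribe an outlier definition. The sketch below uses the common 1.5 x IQR rule (Tukey's fences) as one possible convention, on invented income values; a box plot or a z-score threshold would be equally reasonable choices.

```python
import pandas as pd

# Hypothetical income values with one very large observation.
income = pd.Series([20, 25, 30, 35, 40, 45, 50, 400], name="income")

# Quartiles and interquartile range.
q1 = income.quantile(0.25)
q3 = income.quantile(0.75)
iqr = q3 - q1

# Tukey's fences: values beyond 1.5 * IQR from the quartiles are flagged.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = income[(income < lower) | (income > upper)]
print(outliers.tolist())
```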
Investigate the relationship between age and income
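One simple starting point is the Pearson correlation coefficient, sketched here on invented values. It measures only linear association, so a scatter plot of the two variables would be a natural complement.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "age": [22, 30, 38, 45, 53, 60],
    "income": [18, 25, 31, 40, 52, 58],
})

# Pearson correlation between age and income (a value between -1 and 1).
r = df["age"].corr(df["income"])
print(round(r, 3))
```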
How many people earn more than 300 units?
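A short sketch of counting rows that satisfy a condition, using a boolean mask on invented income values. The comparison is strict, so a value of exactly 300 is not counted.

```python
import pandas as pd

# Hypothetical income values, in the same units as the dataset.
income = pd.Series([120, 310, 300, 450, 80], name="income")

# True where income is strictly greater than 300; summing counts the Trues.
n_high = int((income > 300).sum())
print(n_high)
```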
What is the data type of the marital status variable?
Create dummy variables for gender
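A minimal sketch using pandas.get_dummies on an invented frame, assuming the gender values have already been cleaned to two labels. Passing dtype=int asks for 0/1 columns instead of booleans.

```python
import pandas as pd

# Hypothetical sample rows with a cleaned gender column.
df = pd.DataFrame({
    "age": [25, 40, 33],
    "gender": ["m", "f", "m"],
})

# Replace the gender column with one indicator column per category.
dummies = pd.get_dummies(df, columns=["gender"], dtype=int)
print(dummies)
```

For regression models it is common to drop one of the indicator columns (drop_first=True) to avoid perfectly collinear predictors.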